Information content of sets of biological sequences revisited

Authors

  • Alessandra Carbone
  • Stefan Engelen
Abstract

To analyze the information contained in a pool of amino-acid sequences, a first approach is to align the sequences, estimate the probability of each amino acid occurring within each column of the alignment, and combine these values through an "entropy" function whose minimum corresponds to the absence of information, that is, to the case where every amino acid is equally likely to occur. An alternative is to construct a distance tree between the sequences (derived from the alignment) based on sequence similarity, and to interpret the tree topology so as to model the evolutionary property of residue conservation. We introduced the concept of the "evolutionary content" of a tree of sequences, and demonstrated to what extent the more classical notion of "information content" of sequences approximates the new measure and in what manner tree topology contributes sharper information for the detection of protein binding sites.
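The classical column-wise measure mentioned in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the common formulation in which the information content of a column is log2(20) minus the Shannon entropy of the observed residue frequencies, so that a column with uniformly distributed amino acids scores zero. The function names `column_information` and `alignment_information`, the 20-letter alphabet, and the choice to ignore gap characters are all assumptions made for this example. The tree-based "evolutionary content" is not defined in the abstract and is not attempted here.

```python
import math
from collections import Counter

# Assumption: the 20 standard amino acids; gap characters are ignored.
ALPHABET_SIZE = 20


def column_information(column: str) -> float:
    """Information content (in bits) of one alignment column.

    Computed as log2(20) - H, where H is the Shannon entropy of the
    observed residue frequencies. A fully conserved column gives
    log2(20); a uniform column over all 20 residues gives 0.
    """
    residues = [c for c in column.upper() if c.isalpha()]
    if not residues:
        return 0.0  # column made only of gaps: treated as uninformative
    counts = Counter(residues)
    total = len(residues)
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return math.log2(ALPHABET_SIZE) - entropy


def alignment_information(alignment: list[str]) -> list[float]:
    """Information content of every column of an aligned set of sequences.

    All sequences are assumed to have the same (aligned) length.
    """
    return [column_information("".join(col)) for col in zip(*alignment)]


if __name__ == "__main__":
    toy_alignment = ["AC-", "AD-"]  # hypothetical toy alignment
    print(alignment_information(toy_alignment))
```

Under these assumptions a fully conserved column reaches the maximum of log2(20) bits, while a column in which all 20 amino acids appear equally often scores zero, matching the "absence of information" case described in the abstract.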

Similar resources

A computational method to analyze the similarity of biological sequences under uncertainty

In this paper, we propose a new method to analyze the difference and similarity of biological sequences, based on fuzzy set theory. Considering the sequence order and some chemical and structural properties, we present a computational method to cluster the biological sequences. Through some examples, we show that the new method is relatively easy and we are able to compare the sequences of arbi...

THE ENTROPIES OF THE SEQUENCES OF FUZZY SETS AND THE APPLICATIONS OF ENTROPY TO CARDIOGRAPHY

In this paper, we first introduce the entropy of sequences of fuzzy sets and give some theorems about it. Secondly, the P and T waves which appear in electrocardiograms were transferred to fuzzy sets by using the definition of entropy for sequences of fuzzy sets, and some numerical values were obtained for sequences of P and T waves. Thus any person can make medical predictions for some cardiac...

Wijsman Statistical Convergence of Double Sequences of Sets

In this paper, we study the concepts of Wijsman statistical convergence, Hausdorff statistical convergence and Wijsman statistical Cauchy double sequences of sets and investigate the relationship between them.

Subgeneric classification of Linaria (Plantaginaceae; Antirrhineae): molecular phylogeny and morphology revisited

Linaria Mill. (Plantaginaceae) with about 160 spp. is the largest genus of the tribe Antirrhineae. We conducted phylogenetic analyses of nuclear ribosomal DNA internal transcribed spacer region (ITS) and chloroplast DNA (rpl32-trnL) sequence data to test the monophyly of currently recognized sections in Linaria. For this purpose 86 species representing seven sections of Linaria and one species ...

Sweep Line Algorithm for Convex Hull Revisited

The convex hull of a given set of points is the intersection of all convex sets containing them. It is used as a primary structure in many other problems in computational geometry and in other areas such as image processing, model identification, geographical data systems, and triangular computation of a set of points. Computing the convex hull of a set of points is one of the most fundamental and imp...

Publication date: 2008